Safe Exploitation of Predictions of Opponent Behavior
Abstract
Given a prediction of opponent behavior in a general-sum two-player normal form game, it is difficult to select a strategy that balances the opportunity to use the prediction to inform one’s action with the risk of becoming predictable. We propose Restricted Stackelberg Response with Safety (RSRS), a novel way of generating such a strategy. RSRS uses an r-safe Stackelberg equilibrium in a modified game. We describe an algorithm which selects parameter values for RSRS to produce strategies that can play well against the prediction, respond to a best-responding opponent, or guard against worst-case outcomes. We have tested the algorithm on multiple general-sum games against different opponents.
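The three goals named in the abstract (playing well against the prediction, responding to a best-responding opponent, and guarding against the worst case) can be made concrete on a small game. The following is a minimal sketch only, and it is not the RSRS algorithm itself: the grid search, the weight `w`, the safety margin `r`, and all function names are assumptions introduced for illustration, and the sketch is limited to a player with two actions.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]  # payoff[i][j]: row action i, column action j
TOL = 1e-9


def expected_vs_column(p: Sequence[float], payoff: Matrix, j: int) -> float:
    """Expected payoff of row mix p when the opponent plays column j."""
    return sum(p[i] * payoff[i][j] for i in range(len(p)))


def value_vs_prediction(p: Sequence[float], A: Matrix, q: Sequence[float]) -> float:
    """Our expected payoff if the opponent plays the predicted mix q."""
    return sum(q[j] * expected_vs_column(p, A, j) for j in range(len(q)))


def worst_case(p: Sequence[float], A: Matrix) -> float:
    """Our payoff if the opponent picks the column that hurts us most."""
    return min(expected_vs_column(p, A, j) for j in range(len(A[0])))


def value_vs_best_responder(p: Sequence[float], A: Matrix, B: Matrix) -> float:
    """Our payoff if the opponent best-responds to p under its own payoffs B.

    Ties between opponent best responses are broken in our favor
    (an assumption of this sketch).
    """
    cols = range(len(A[0]))
    opp = [expected_vs_column(p, B, j) for j in cols]
    best = max(opp)
    return max(expected_vs_column(p, A, j) for j in cols if opp[j] >= best - TOL)


def pick_strategy(A: Matrix, B: Matrix, q: Sequence[float],
                  w: float, r: float, steps: int = 100) -> Tuple[float, float]:
    """Grid-search a two-action mixed strategy.

    Only strategies whose worst case is within r of the best worst case
    on the grid are kept; among those, w trades off payoff against the
    prediction (w = 1) and payoff against a best responder (w = 0).
    """
    grid: List[Tuple[float, float]] = [
        (k / steps, 1 - k / steps) for k in range(steps + 1)
    ]
    maximin = max(worst_case(p, A) for p in grid)
    safe = [p for p in grid if worst_case(p, A) >= maximin - r - TOL]

    def score(p: Tuple[float, float]) -> float:
        return (w * value_vs_prediction(p, A, q)
                + (1 - w) * value_vs_best_responder(p, A, B))

    return max(safe, key=score)


# Hypothetical 2x2 general-sum game: A holds our payoffs, B the opponent's.
A = [[3.0, 0.0], [1.0, 2.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
q = [1.0, 0.0]  # prediction: the opponent always plays its first action

trusting = pick_strategy(A, B, q, w=1.0, r=10.0)  # loose safety, trust q fully
cautious = pick_strategy(A, B, q, w=1.0, r=0.0)   # no loss of safety allowed
```

With a loose safety margin the sketch commits fully to exploiting the prediction, while with `r = 0` it is forced back to the safest mix on the grid regardless of `w`. Intermediate values of `w` and `r` move between these extremes.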
Similar resources
Partial Best Response: Balancing Exploitation and Safety
We propose Partial Best Response as a method for an agent playing a normal form game to balance the rewards of exploiting a projection of opponent behavior with the risks of being exploited by the opponent. If an agent best responds to a prediction of opponent behavior, it will open itself up to the possibility of exploitation. If it avoids exploitation by playing a Nash equilibrium, it can’t b...
How to safely exploit predictions in general-sum normal form games
Given a general-sum normal form game and a prediction of opponent behavior it is difficult to select a strategy which balances the opportunity to exploit the prediction with the risk of being exploited. We propose Restricted Stackelberg Response with Safety (RSRS), a novel way of generating such a strategy. RSRS uses an r-safe Stackelberg equilibrium in a modified game, which is created to refl...
How to Safely Exploit Predictions in General-Sum Normal Form Games
Given a prediction of opponent behavior in a general-sum two-player normal form game, it is difficult to select a strategy that balances the opportunity to use the prediction to inform one’s action with the risk of becoming predictable. We propose Restricted Stackelberg Response with Safety (RSRS), a novel way of generating such a strategy. RSRS uses an r-safe Stackelberg equilibrium in a modifi...
Exploitation and Safety in General Sum Games
We describe a method for an agent playing a general-sum normal form game to balance the rewards of exploiting a prediction of opponent behavior with the risks of being exploited by a self-interested opponent while guaranteeing a worst-case safety margin. Our algorithm, Restricted Stackelberg Response with Safety, calculates a probability distribution over the agent’s moves that balances those co...